A transformation is diagonalizable if it has a diagonal representation with respect to the same basis for the codomain as for the domain.
A diagonalizable matrix is any matrix $T$ such that $PTP^{-1}$ is diagonal for some nonsingular matrix $P$.
Example 15.1
$$T=\begin{pmatrix} 6 & -1 & -1 \\ 2 & 11 & -1 \\ -6 & -5 & 7 \end{pmatrix}$$
is diagonalizable using
$$P=\begin{pmatrix} 1/2 & 1/4 & 1/4 \\ -1/2 & 1/4 & 1/4 \\ -1/2 & -3/4 & 1/4 \end{pmatrix} \qquad P^{-1}=\begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 2 & 1 & 1 \end{pmatrix}$$
to get
$$D=PTP^{-1}=\begin{pmatrix} 4 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 12 \end{pmatrix}$$
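As a quick numerical sanity check, a minimal sketch in NumPy using the matrices above:

```python
import numpy as np

# Matrices from Example 15.1.
T = np.array([[ 6, -1, -1],
              [ 2, 11, -1],
              [-6, -5,  7]])
P = np.array([[ 1/2,  1/4, 1/4],
              [-1/2,  1/4, 1/4],
              [-1/2, -3/4, 1/4]])

# P T P^{-1} should be the diagonal matrix with entries 4, 8, 12.
D = P @ T @ np.linalg.inv(P)
print(np.round(D, 10))
```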
Example 15.2
Consider whether the following matrix is diagonalizable:
$$N=\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$
Since $N^2$ is the zero matrix, for any map $n$ represented by $N$ with bases $B,B$, the composition $n\circ n$ is the zero map. This implies that any representation of $n$ with some bases $\hat{B},\hat{B}$ has a square that is the zero matrix. However, if there were some diagonal matrix $D$ similar to $N$, then the diagonal entries of $D^2$ would be the squares of the entries of $D$, so $D^2=0$ would force $D=0$; and the only matrix similar to the zero matrix is the zero matrix itself, which $N$ is not. Thus, $N$ is not diagonalizable.
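A small NumPy check of the nilpotency that drives this argument:

```python
import numpy as np

N = np.array([[0, 0],
              [1, 0]])

# N^2 is the zero matrix, so any similar matrix S = P N P^{-1} also
# satisfies S^2 = P N^2 P^{-1} = 0, which no nonzero diagonal matrix can.
print(N @ N)
```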
Diagonalization Theorem:
A transformation $t$ is diagonalizable iff there is a basis $B=\langle \beta_1,\dots,\beta_n \rangle$ and scalars $\lambda_1,\dots,\lambda_n$ such that $t(\beta_i)=\lambda_i\beta_i$ for each $i$.
Proof
If $B=\langle \beta_1,\dots,\beta_n \rangle$ is a basis, then $t(\beta_i)=\lambda_i\beta_i \iff \operatorname{Rep}_B(t(\beta_i))=\lambda_i\operatorname{Rep}_B(\beta_i)$,
which is equivalent to
$$\operatorname{Rep}_B(t(\beta_i)) = \lambda_i \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix} \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ \lambda_i \\ \vdots \\ 0 \end{pmatrix}$$
so $t(\beta_i)=\lambda_i\beta_i$ holds for every $i$ exactly when $\operatorname{Rep}_{B,B}(t)$ is the diagonal matrix with entries $\lambda_1,\dots,\lambda_n$.
Example 15.3
$$T=\begin{pmatrix} 4 & 1 \\ 0 & -1 \end{pmatrix}$$
is not diagonal, but a similar matrix can be found that is diagonal.
If $T=\operatorname{Rep}_{\mathcal{E}_2,\mathcal{E}_2}(t)$ for $t:\mathbb{R}^2\to\mathbb{R}^2$, we want to find a basis $B=\langle \beta_1,\beta_2 \rangle$ such that
$$\operatorname{Rep}_{B,B}(t)=\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$
We want
$$\begin{pmatrix} 4 & 1 \\ 0 & -1 \end{pmatrix}\beta_1=\lambda_1\beta_1 \qquad \text{and} \qquad \begin{pmatrix} 4 & 1 \\ 0 & -1 \end{pmatrix}\beta_2=\lambda_2\beta_2$$
Or, writing $\beta_k$ as a column vector, we want a scalar $x\in\mathbb{C}$ such that
$$\begin{pmatrix} 4 & 1 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}=x\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$$
has a solution for $b_1,b_2$ that are not both zero.
Rewriting this as a linear system:
$$\begin{aligned} (4-x)b_1+b_2&=0 \\ (-1-x)b_2&=0 \end{aligned}$$
If $-1-x\neq 0$ and $4-x\neq 0$, then the only solution is $b_1=b_2=0$. So, one solution results from $-1-x=0$, i.e. $x=-1$, giving $b_1=(-1/5)b_2$, and the other from $4-x=0$, i.e. $x=4$, giving $b_2=0$.
Thus, our basis is $B=\left\langle \begin{pmatrix} -1 \\ 5 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right\rangle$, with respect to which $t$ has the diagonal representation
$$D=\begin{pmatrix} -1 & 0 \\ 0 & 4 \end{pmatrix}$$
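A quick NumPy verification that each basis vector is rescaled by its eigenvalue, a sketch assuming the basis found above:

```python
import numpy as np

T = np.array([[4.,  1.],
              [0., -1.]])
basis = [np.array([-1., 5.]), np.array([1., 0.])]
scalars = [-1., 4.]

# Each beta should satisfy T @ beta = lambda * beta.
for lam, beta in zip(scalars, basis):
    assert np.allclose(T @ beta, lam * beta)
```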
Application of Diagonalization: Powers of Matrices
Using diagonalization, it is easy to compute powers of matrices. If $A=D$ is diagonal, then $D^n$ is also diagonal, and its entries are the $n$th powers of the entries of $D$.
If $A$ is diagonalizable, then
$$\begin{aligned} A &= PDP^{-1} \\ A^2 &= (PDP^{-1})(PDP^{-1}) = PD(P^{-1}P)DP^{-1} = PDIDP^{-1} = PD^2P^{-1} \\ A^3 &= (PDP^{-1})(PD^2P^{-1}) = PD(P^{-1}P)D^2P^{-1} = PDID^2P^{-1} = PD^3P^{-1} \\ &\;\;\vdots \\ A^n &= PD^nP^{-1} \end{aligned}$$
Since $D$ is diagonal, $PD^nP^{-1}$ is much easier to compute than $A^n$ directly.
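A minimal sketch of this idea in NumPy; the helper name `matrix_power_by_diagonalization` is ours, and it assumes the input really is diagonalizable:

```python
import numpy as np

def matrix_power_by_diagonalization(A: np.ndarray, n: int) -> np.ndarray:
    """Compute A^n as P D^n P^{-1}, assuming A is diagonalizable."""
    # np.linalg.eig returns the eigenvalues and a matrix whose columns
    # are the corresponding eigenvectors (our P).
    eigenvalues, P = np.linalg.eig(A)
    Dn = np.diag(eigenvalues ** n)   # D^n: just take powers of the diagonal
    return P @ Dn @ np.linalg.inv(P)
```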
Example 15.4
Let $A=\begin{pmatrix} 1 & 2 \\ -1 & 4 \end{pmatrix}$. Find $A^n$.
We first evaluate P and D.
We must find $x\in\mathbb{C}$ such that
$$\begin{pmatrix} 1 & 2 \\ -1 & 4 \end{pmatrix}\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}=x\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$$
has solutions other than $b_1=b_2=0$.
The associated linear system is
$$\begin{aligned} (1-x)b_1+2b_2&=0 \\ -b_1+(4-x)b_2&=0 \end{aligned}$$
Substituting $b_1=(4-x)b_2$ into the first equation gives
$$(1-x)(4-x)b_2+2b_2=(x^2-5x+6)b_2=0$$
For $b_2$ to be nonzero, we need $x^2-5x+6=0$, which holds when $x=2$ or $x=3$.
When $x=2$ we have $b_1=2b_2$, and when $x=3$ we have $b_1=b_2$. Thus, our basis is $B=\left\langle \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\rangle$ and the diagonal matrix is $D=\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$.
From this, we generate
$$P=\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \qquad P^{-1}=\begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$$
Thus, we have
$$A^n=PD^nP^{-1}=\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 2^n & 0 \\ 0 & 3^n \end{pmatrix}\begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}=\begin{pmatrix} 2^{n+1}-3^n & -2^{n+1}+2\cdot 3^n \\ 2^n-3^n & -2^n+2\cdot 3^n \end{pmatrix}$$
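Checking the closed form against repeated multiplication; the helper name `A_power_closed_form` is ours for the formula above:

```python
import numpy as np

A = np.array([[ 1, 2],
              [-1, 4]])

def A_power_closed_form(n: int) -> np.ndarray:
    return np.array([[2**(n+1) - 3**n, -(2**(n+1)) + 2 * 3**n],
                     [2**n - 3**n,     -(2**n) + 2 * 3**n]])

# The formula matches n-fold matrix multiplication exactly.
for n in range(1, 6):
    assert np.array_equal(np.linalg.matrix_power(A, n), A_power_closed_form(n))
```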
Eigenvalues and Eigenvectors
A transformation $t:V\to V$ has a scalar eigenvalue $\lambda$ if there is a nonzero eigenvector $\zeta\in V$ such that $t(\zeta)=\lambda\cdot\zeta$.
Similarly, a square matrix $T$ has a scalar eigenvalue $\lambda$ associated with the nonzero eigenvector $\zeta$ if $T\zeta=\lambda\cdot\zeta$
Example 15.5
The matrix
$$D=\begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}$$
has eigenvalues $\lambda_1=4$ and $\lambda_2=2$.
The first is associated with the eigenvector $\vec{e}_1$:
$$\begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}=4\cdot\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
The second is associated with $\vec{e}_2$:
$$\begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix}=2\cdot\begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
The matrix acts on each of these vectors by rescaling.
Computing Eigenvalues and Eigenvectors
See the example below.
Example 15.6
Consider the matrix
$$T=\begin{pmatrix} 0 & 5 & 7 \\ -2 & 7 & 7 \\ -1 & 1 & 4 \end{pmatrix}$$
We want to find scalars $x$ such that $T\zeta=x\zeta$ for some nonzero $\zeta$. Bring the terms to the left:
$$\begin{pmatrix} 0 & 5 & 7 \\ -2 & 7 & 7 \\ -1 & 1 & 4 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}-x\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}=\vec{0}$$
And factor out the vector:
$$\begin{pmatrix} -x & 5 & 7 \\ -2 & 7-x & 7 \\ -1 & 1 & 4-x \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}=\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
This has nonzero solutions iff the matrix is singular, i.e. has a determinant of zero:
$$0=\begin{vmatrix} -x & 5 & 7 \\ -2 & 7-x & 7 \\ -1 & 1 & 4-x \end{vmatrix}=-(x-5)(x-4)(x-2)$$
So the eigenvalues are $\lambda_1=5$, $\lambda_2=4$, and $\lambda_3=2$.
To find the associated eigenvectors, substitute each eigenvalue and solve.
For $\lambda_1=5$:
$$\begin{pmatrix} -5 & 5 & 7 \\ -2 & 2 & 7 \\ -1 & 1 & -1 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}=\vec{0}$$
Gauss's Method yields the solution set
$$V_5=\left\{ \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}z_2 \;\middle|\; z_2\in\mathbb{C} \right\}$$
For $\lambda_2=4$:
$$\begin{pmatrix} -4 & 5 & 7 \\ -2 & 3 & 7 \\ -1 & 1 & 0 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}=\vec{0}$$
The solutions are
$$V_4=\left\{ \begin{pmatrix} -7 \\ -7 \\ 1 \end{pmatrix}z_3 \;\middle|\; z_3\in\mathbb{C} \right\}$$
For $\lambda_3=2$:
$$\begin{pmatrix} -2 & 5 & 7 \\ -2 & 5 & 7 \\ -1 & 1 & 2 \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}=\vec{0}$$
gives
$$V_2=\left\{ \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}z_3 \;\middle|\; z_3\in\mathbb{C} \right\}$$
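The same eigenvalues and eigenspace directions fall out of NumPy; a quick check, not part of the hand computation:

```python
import numpy as np

T = np.array([[ 0., 5., 7.],
              [-2., 7., 7.],
              [-1., 1., 4.]])

eigenvalues, eigenvectors = np.linalg.eig(T)
print(np.round(eigenvalues, 6))   # 5, 4, 2 in some order

# Each column of `eigenvectors` is a unit eigenvector: a scalar multiple
# of the hand-computed (1,1,0), (-7,-7,1), (1,-1,1).
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(T @ v, lam * v)
```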
If the matrix is an n×n upper or lower triangular matrix with diagonal entries λ1,...,λn, then its eigenvalues are λ1,...,λn.
Two similar matrices have the same eigenvalues but, in general, different eigenvectors, because they describe the same action with respect to different bases.
The characteristic polynomial of a square matrix $T$ is the determinant $|T-xI|$, where $x$ is a variable. The characteristic equation is $|T-xI|=0$
For an upper or lower triangular $n\times n$ matrix with diagonal entries $\lambda_1,\dots,\lambda_n$, the characteristic polynomial is $(\lambda_1-x)\cdots(\lambda_n-x)$
The eigenvalues of a square matrix are exactly the roots of the characteristic polynomial.
Proof
$\lambda$ is an eigenvalue of $T$ iff there exists a nonzero $v$ such that $Tv=\lambda v$, or equivalently, $(T-\lambda I)v=\vec{0}$. Since $v\neq\vec{0}$, we must have $T-\lambda I$ be noninvertible, which is equivalent to $|T-\lambda I|=0$
Two similar matrices have the same characteristic polynomial. In particular, they have the same eigenvalues.
Proof
Suppose $T_1$ and $T_2$ are similar, so there exists a nonsingular $P$ such that $T_2=PT_1P^{-1}$.
Then
$$T_2-xI = PT_1P^{-1}-xI = PT_1P^{-1}-xPP^{-1} = PT_1P^{-1}-P(xI)P^{-1} = P(T_1-xI)P^{-1}$$
From $T_2-xI=P(T_1-xI)P^{-1}$, we obtain
$$|T_2-xI| = |P(T_1-xI)P^{-1}| = |P|\,|T_1-xI|\,|P^{-1}| = |T_1-xI|$$
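Numerically, conjugating by a random (almost surely nonsingular) matrix leaves the characteristic polynomial's coefficients unchanged; note that `np.poly` returns the monic polynomial $\det(xI-A)$:

```python
import numpy as np

T1 = np.array([[ 0., 5., 7.],
               [-2., 7., 7.],
               [-1., 1., 4.]])
P = np.random.default_rng(0).random((3, 3))   # almost surely nonsingular
T2 = P @ T1 @ np.linalg.inv(P)

# Same coefficients: x^3 - 11x^2 + 38x - 40, i.e. roots 5, 4, 2.
print(np.round(np.poly(T1), 6))
print(np.round(np.poly(T2), 6))
```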
Since similar matrices are representations of the same map with respect to different bases, we define the characteristic polynomial of a linear map $t:V\to V$ as the characteristic polynomial of any matrix representation $\operatorname{Rep}_{B,B}(t)$, where $B$ is a basis of $V$.
Note that if $t$ acts on an $n$-dimensional space (so the matrix representation $T$ is $n\times n$), the characteristic polynomial has degree $n$.
A linear transformation of a nontrivial vector space has a characteristic polynomial of degree at least one, which must have at least one root over the complex numbers. So, such a transformation must have at least one eigenvalue.
The eigenspace of a linear transformation $t$ associated with the eigenvalue $\lambda$ is $V_\lambda=\{\zeta \mid t(\zeta)=\lambda\zeta\}$. The eigenspace of a square matrix is defined analogously.
An eigenspace is a nontrivial subspace.
Proof
First, it is nonempty, since $t(\vec{0})=\lambda\cdot\vec{0}$. Moreover, because the eigenvector associated with $\lambda$ cannot be $\vec{0}$, $V_\lambda$ must contain more than just $\vec{0}$, so it is nontrivial.
Now, we must show closure under addition and scalar multiplication:
$$t(c_1\zeta_1+\dots+c_n\zeta_n) = c_1t(\zeta_1)+\dots+c_nt(\zeta_n) = c_1\lambda\zeta_1+\dots+c_n\lambda\zeta_n = \lambda(c_1\zeta_1+\dots+c_n\zeta_n)$$
so any linear combination of vectors in $V_\lambda$ is again in $V_\lambda$.
For any set of distinct eigenvalues, a set of associated eigenvectors (one per eigenvalue) is linearly independent.
Proof
We use induction on the number of eigenvalues. The base case is 0 eigenvalues, in which case the set of eigenvectors is empty and therefore linearly independent.
Assume the claim holds for some $k\geq 0$ and consider distinct eigenvalues $\lambda_1,\dots,\lambda_{k+1}$ with associated eigenvectors $v_1,\dots,v_{k+1}$.
Suppose
$$\vec{0}=c_1v_1+\dots+c_kv_k+c_{k+1}v_{k+1}$$
Derive two equations: one by multiplying $\lambda_{k+1}$ on both sides, and the other by applying the map:
$$\vec{0}=c_1\lambda_{k+1}v_1+\dots+c_k\lambda_{k+1}v_k+c_{k+1}\lambda_{k+1}v_{k+1}$$
$$\vec{0}=c_1\lambda_1v_1+\dots+c_k\lambda_kv_k+c_{k+1}\lambda_{k+1}v_{k+1}$$
and subtract:
$$\vec{0}=c_1(\lambda_{k+1}-\lambda_1)v_1+\dots+c_k(\lambda_{k+1}-\lambda_k)v_k$$
The $v_{k+1}$ term vanishes. By the inductive hypothesis, $v_1,\dots,v_k$ are linearly independent, so $c_i(\lambda_{k+1}-\lambda_i)=0$ for $i=1,\dots,k$
However, since the eigenvalues are distinct, $\lambda_{k+1}-\lambda_i\neq 0$, so we must have $c_1=c_2=\dots=c_k=0$. From this and the original equation, we have $c_{k+1}v_{k+1}=\vec{0}$, and since $v_{k+1}\neq\vec{0}$, $c_{k+1}=0$ as well.
An n×n matrix with n distinct eigenvalues is diagonalizable.
Proof
Call the n distinct eigenvalues λ1,...,λn. For i=1,...,n we can find a non-zero eigenvector ζi∈Cn for the eigenvalue λi. From the previous theorem, they are all linearly independent, so they form a basis for Cn. Since this is a basis of eigenvectors, the matrix is diagonalizable.
Finally, we consider what happens if we have an n×n matrix T with real coefficients.
If $\lambda\in\mathbb{R}$ is an eigenvalue, then $T-\lambda I$ is a noninvertible matrix with real entries, so its nullspace must contain a nonzero real vector.
Therefore, for a real eigenvalue of a real $n\times n$ matrix we can always find an eigenvector in $\mathbb{R}^n$. On the other hand, if the eigenvalue $\lambda$ of the real matrix $T$ is not in $\mathbb{R}$, then no eigenvector for $\lambda$ is in $\mathbb{R}^n$.
Notes
Matrices with the same eigenvalues need not be similar. For example, $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ both have the single eigenvalue 1, but the only matrix similar to the identity is the identity itself.
The characteristic polynomial of a $2\times 2$ matrix $A$ is $f(x)=x^2-\operatorname{Tr}(A)\,x+\det(A)$
Let λ be an eigenvalue of a square matrix A.
The geometric multiplicity of $\lambda$ is the dimension of the $\lambda$-eigenspace. The algebraic multiplicity of $\lambda$ is its multiplicity as a root of the characteristic polynomial.
Note that over the complex scalars, the algebraic multiplicities sum to $n$.
The two multiplicities are related by
$$1 \leq \text{geometric multiplicity of } \lambda \leq \text{algebraic multiplicity of } \lambda$$
This means if the algebraic multiplicity is 1, then the geometric multiplicity is also 1.
Alternate Diagonalization Theorem:
Let $A$ be an $n\times n$ matrix. The following are equivalent:
$A$ is diagonalizable
the sum of the geometric multiplicities of the eigenvalues of $A$ equals $n$
for each eigenvalue, the geometric multiplicity equals the algebraic multiplicity
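A numerical sketch of this test; the helper names are ours, and the tolerance-based grouping of eigenvalues is only suitable for small, well-separated examples:

```python
import numpy as np

def multiplicities(A: np.ndarray, tol: float = 1e-8):
    """List (eigenvalue, algebraic, geometric) for each distinct eigenvalue."""
    n = A.shape[0]
    eigenvalues = np.linalg.eigvals(A)
    seen, triples = [], []
    for lam in eigenvalues:
        if any(abs(lam - mu) < tol for mu in seen):
            continue                                  # already handled
        seen.append(lam)
        algebraic = int(np.sum(np.abs(eigenvalues - lam) < tol))
        # geometric multiplicity = dim null(A - lam*I) = n - rank(A - lam*I)
        geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        triples.append((lam, algebraic, geometric))
    return triples

def is_diagonalizable(A: np.ndarray) -> bool:
    return all(alg == geo for _, alg, geo in multiplicities(A))

print(is_diagonalizable(np.array([[1., 1.], [0., 1.]])))   # False: a shear
print(is_diagonalizable(np.array([[4., 0.], [0., 2.]])))   # True
```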
Trick for computing eigenvectors of 2×2 matrices
Let A be a 2×2 matrix with an eigenvalue λ.
We know $\det(A-\lambda I)=0$, so $A-\lambda I$ is not invertible and its two rows are linearly dependent. So, if the first row is not all zeros, the second row is a scalar multiple of the first. Hence, in reduced row echelon form, the second row disappears, so it doesn't matter what it is.
If $A-\lambda I=\begin{pmatrix} a & b \\ * & * \end{pmatrix}$ with the first row nonzero, then $\begin{pmatrix} -b \\ a \end{pmatrix}$ and $\begin{pmatrix} b \\ -a \end{pmatrix}$ are both eigenvectors. The same works with the second row if the first row is all zeros.
If $A-\lambda I=\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, then every nonzero vector is an eigenvector.
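The trick translates directly into code; a sketch, with `eigenvector_2x2` as our own helper name:

```python
import numpy as np

def eigenvector_2x2(A: np.ndarray, lam: complex) -> np.ndarray:
    """Read an eigenvector of a 2x2 matrix off a nonzero row of A - lam*I."""
    M = A - lam * np.eye(2)
    row = M[0] if np.any(M[0]) else M[1]   # prefer the first nonzero row
    a, b = row
    if a == 0 and b == 0:
        return np.array([1.0, 0.0])        # A - lam*I = 0: anything nonzero works
    return np.array([-b, a])               # (a, b) . (-b, a) = 0

A = np.array([[1., -1.],
              [1.,  1.]])
v = eigenvector_2x2(A, 1 - 1j)              # a multiple of (1, i)
assert np.allclose(A @ v, (1 - 1j) * v)
```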
For a real matrix, conjugate eigenvalues correspond to conjugate eigenvectors; i.e. if $v$ is an eigenvector with eigenvalue $\lambda$, then $\bar{v}$ is an eigenvector with eigenvalue $\bar{\lambda}$, because
$$A\bar{v}=\overline{Av}=\overline{\lambda v}=\bar{\lambda}\bar{v}$$
Let $A$ be a real $2\times 2$ matrix with complex, non-real eigenvalue $\lambda$ and eigenvector $v$. Then $A=PCP^{-1}$ where
$$P=\begin{pmatrix} | & | \\ \operatorname{Re}(v) & \operatorname{Im}(v) \\ | & | \end{pmatrix} \qquad C=\begin{pmatrix} \operatorname{Re}(\lambda) & \operatorname{Im}(\lambda) \\ -\operatorname{Im}(\lambda) & \operatorname{Re}(\lambda) \end{pmatrix}$$
The matrix $C$ is a composition of rotation by $-\arg(\lambda)$ and scaling by $|\lambda|$:
$$C=\begin{pmatrix} |\lambda| & 0 \\ 0 & |\lambda| \end{pmatrix}\begin{pmatrix} \cos(-\arg(\lambda)) & -\sin(-\arg(\lambda)) \\ \sin(-\arg(\lambda)) & \cos(-\arg(\lambda)) \end{pmatrix}$$
Thus, a $2\times 2$ real matrix with non-real eigenvalue $\lambda$ is similar to (rotation by $-\arg(\lambda)$) composed with (scaling by $|\lambda|$), which is multiplication by the complex number $\bar{\lambda}$ in $\mathbb{C}\cong\mathbb{R}^2$; picking the conjugate eigenvalue instead gives rotation by $\arg(\lambda)$, i.e. multiplication by $\lambda$.
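A short check that $C$ really is a scaled rotation, using the $\lambda=1-i$ from Example 15.9 below:

```python
import numpy as np

lam = 1 - 1j
C = np.array([[ lam.real, lam.imag],
              [-lam.imag, lam.real]])

r, theta = abs(lam), -np.angle(lam)         # scale |lam|, rotate by -arg(lam)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
assert np.allclose(C, r * rotation)         # C = |lam| * R(-arg lam)
```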
Examples
Diagonalization of a 3×3 matrix
Example 15.7
Diagonalize
$$A=\begin{pmatrix} 4 & -3 & 0 \\ 2 & -1 & 0 \\ 1 & -1 & 1 \end{pmatrix}$$
First find the characteristic polynomial:
$$f(x)=\det(A-xI)=(1-x)\big((4-x)(-1-x)+6\big)=-(x-1)^2(x-2)$$
The eigenvalues are 1 and 2 with multiplicities 2 and 1, respectively.
First compute the 1-eigenspace:
$$\begin{pmatrix} 3 & -3 & 0 \\ 2 & -2 & 0 \\ 1 & -1 & 0 \end{pmatrix}v=\vec{0} \;\xrightarrow{\text{Gauss's Method}}\; \begin{pmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}v=\vec{0}$$
Therefore
$$v=\begin{pmatrix} x \\ y \\ z \end{pmatrix}=y\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}+z\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$
and a basis for the 1-eigenspace is
$$B_1=\left\langle \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\rangle$$
For the 2-eigenspace:
$$\begin{pmatrix} 2 & -3 & 0 \\ 2 & -3 & 0 \\ 1 & -1 & -1 \end{pmatrix}v=\vec{0} \;\xrightarrow{\text{Gauss's Method}}\; \begin{pmatrix} 1 & 0 & -3 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix}v=\vec{0}$$
Therefore, a basis for the 2-eigenspace is
$$B_2=\left\langle \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} \right\rangle$$
The three basis vectors in $B_1$ and $B_2$ are linearly independent, so we have $A=PDP^{-1}$ for
$$P=\begin{pmatrix} 1 & 0 & 3 \\ 1 & 0 & 2 \\ 0 & 1 & 1 \end{pmatrix}, \qquad D=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$
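A one-line NumPy confirmation of this diagonalization:

```python
import numpy as np

A = np.array([[4., -3., 0.],
              [2., -1., 0.],
              [1., -1., 1.]])
P = np.array([[1., 0., 3.],
              [1., 0., 2.],
              [0., 1., 1.]])
D = np.diag([1., 1., 2.])

assert np.allclose(A, P @ D @ np.linalg.inv(P))   # A = P D P^{-1}
```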
Non-Diagonalizable Matrices
Example 15.8
The matrix $A=\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ has only one eigenvalue, 1.
Computing the 1-eigenspace:
$$(A-I)v=\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}v=\vec{0}$$
The eigenspace only contains multiples of $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Since we need 2 linearly independent eigenvectors for a $2\times 2$ matrix to be diagonalizable, $A$ is not diagonalizable.
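Numerically, the defect shows up as a rank statement; a quick check:

```python
import numpy as np

A = np.array([[1., 1.],
              [0., 1.]])

# lambda = 1 has algebraic multiplicity 2, but A - I has rank 1, so the
# eigenspace is 1-dimensional: too few independent eigenvectors.
print(np.linalg.matrix_rank(A - np.eye(2)))   # 1
```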
Non-real eigenvalue
Example 15.9
Consider
$$A=\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$
The eigenvalues are 1±i, a conjugate pair.
Let $\lambda=1-i$. We compute an eigenvector $v$ using the trick above:
$$A-\lambda I=\begin{pmatrix} i & -1 \\ * & * \end{pmatrix} \;\rightarrow\; v=\begin{pmatrix} 1 \\ i \end{pmatrix}$$
Therefore,
$$P=\begin{pmatrix} \operatorname{Re}\begin{pmatrix} 1 \\ i \end{pmatrix} & \operatorname{Im}\begin{pmatrix} 1 \\ i \end{pmatrix} \end{pmatrix}=\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad C=\begin{pmatrix} \operatorname{Re}(\lambda) & \operatorname{Im}(\lambda) \\ -\operatorname{Im}(\lambda) & \operatorname{Re}(\lambda) \end{pmatrix}=\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$
The matrix $C=A$ scales by a factor of $|\lambda|=\sqrt{1^2+1^2}=\sqrt{2}$, and the argument is $-\pi/4$, therefore $C=A$ rotates by $+\pi/4$ (note that $A$ is $\sqrt{2}$ times the rotation matrix by $\pi/4$, so this matches).
In general, $C$ scales by $|\lambda|$ and rotates by $-\arg(\lambda)$ in the standard basis, while $A$ does the same in the basis formed by the columns of $P$, i.e. the real and imaginary parts of the eigenvector.
This is similar to the real eigenvalue case, where in $A=PDP^{-1}$, $A$ scales like $D$, but in the basis formed from the columns of $P$
Let $A$ be a real $n\times n$ matrix. Suppose that for each eigenvalue (real or complex), the geometric multiplicity equals the algebraic multiplicity. Then $A=PCP^{-1}$, where:
$C$ is block diagonal, where the blocks are either a $1\times 1$ block containing a real eigenvalue $\lambda_k$ or a $2\times 2$ block containing the matrix $\begin{pmatrix} \operatorname{Re}\lambda_k & \operatorname{Im}\lambda_k \\ -\operatorname{Im}\lambda_k & \operatorname{Re}\lambda_k \end{pmatrix}$ for a non-real eigenvalue $\lambda_k$
The columns of $P$ form bases for the eigenspaces of the real eigenvalues, or come in pairs $(\operatorname{Re}v, \operatorname{Im}v)$ for the non-real eigenvectors $v$
Example 15.10
Let $A=\begin{pmatrix} 1 & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$. This is block diagonal. It acts on the $xy$-plane by scaling by $\sqrt{2}$ and rotating by $\pi/4$. It acts on the $z$-axis by scaling by 2.
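A quick check of both actions, a sketch using NumPy:

```python
import numpy as np

A = np.array([[1., -1., 0.],
              [1.,  1., 0.],
              [0.,  0., 2.]])

# The xy-block equals sqrt(2) times rotation by pi/4 ...
theta = np.pi / 4
block = np.sqrt(2) * np.array([[np.cos(theta), -np.sin(theta)],
                               [np.sin(theta),  np.cos(theta)]])
assert np.allclose(A[:2, :2], block)

# ... and the z-axis is simply scaled by 2.
assert np.allclose(A @ np.array([0., 0., 1.]), [0., 0., 2.])
```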